Skip to content

Refactoring#245

Open
ArthurDeclercq wants to merge 77 commits intomainfrom
refactoring
Open

Refactoring#245
ArthurDeclercq wants to merge 77 commits intomainfrom
refactoring

Conversation

@ArthurDeclercq
Copy link
Contributor

@ArthurDeclercq ArthurDeclercq commented Jan 9, 2026

No description provided.

ArthurDeclercq and others added 30 commits February 24, 2024 15:48
pull main in spectrum-feature-generator

# Fit calibration and transform all predictions for this run
calibration = SplineTransformerCalibration()
calibration.fit(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_calibration_data doesn't throw an exception if there are no target psms, contrary to _get_calibration_psms. However, unless I'm missing something, fit will still fail if observed_rt_calibration or predicted_rt_calibration are empty. So it should probably be handled here if one of the runs doesn't have any target psms.

# Calibrate predictions per run
logger.info("Calibrating predicted retention times per run...")
for run in psm_list_df["run"].unique():
run_df = psm_list_df[psm_list_df["run"] == run].copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_df is copied twice, here, and in _get_calibration_data. On top of that, if it would be okay to sort psm_list_df, we could also avoid copying run_df at all, and we could just use:

        calibration_df = run_df[~run_df["is_decoy"]].head(num_calibration_psms)

I suspect sorting all psms, instead of only the targets might be faster than copying all targets if len(target_df) is considerably greater than num_calibration_psms. But of course, if psm_list_df shouldn't be sorted in place, then you can probably ignore this, except for the double copy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants